Web Document Clustering based on a Hierarchical Self-Organizing Model
Abstract
This work presents a hierarchical self-organizing model, based on the GHSOM (growing hierarchical self-organizing map), for clustering web content. The GHSOM is an artificial neural network that has been widely used for data clustering. Its hierarchical architecture is more flexible than a single SOM because it adapts to the input data, mirroring the hierarchical relations inherent in them. The adaptation of the GHSOM architecture is controlled by two parameters, which must be set in advance, and choosing suitable values is not always easy. This paper therefore proposes a one-parameter hierarchical self-organizing model. The model has been evaluated on the 'BankSearch' benchmark dataset. Experimental results indicate that the approach performs well.
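The abstract does not say which two parameters control the GHSOM's adaptation. In the GHSOM literature they are commonly described as a breadth threshold and a depth threshold, and the minimal Python sketch below illustrates that common description only; it is not the one-parameter model proposed in the paper. The names `tau1`, `tau2`, and all function names are assumptions introduced for illustration.

```python
import math

def quantization_error(weight, samples):
    """Sum of Euclidean distances from a unit's weight vector to its mapped samples."""
    return sum(math.dist(weight, x) for x in samples)

def should_grow_map(unit_errors, parent_error, tau1):
    """Breadth test (assumed form): keep adding units to a map while its mean
    quantization error is at least tau1 times the error of its parent unit."""
    mean_error = sum(unit_errors) / len(unit_errors)
    return mean_error >= tau1 * parent_error

def units_to_expand(unit_errors, root_error, tau2):
    """Depth test (assumed form): indices of units whose error exceeds tau2 times
    the error of the whole dataset; each such unit would receive a child map."""
    return [i for i, qe in enumerate(unit_errors) if qe > tau2 * root_error]

# Toy data: four 2-D points, summarized first by one root unit, then by two units.
samples = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
root_error = quantization_error((2.0, 1.0), samples)
unit_errors = [
    quantization_error((0.0, 1.0), samples[:2]),
    quantization_error((4.0, 1.0), samples[2:]),
]

grow_loose = should_grow_map(unit_errors, root_error, tau1=0.5)
grow_strict = should_grow_map(unit_errors, root_error, tau1=0.1)
expand_loose = units_to_expand(unit_errors, root_error, tau2=0.5)
expand_strict = units_to_expand(unit_errors, root_error, tau2=0.1)
```

Smaller values of either threshold demand a finer representation: a smaller `tau1` yields larger maps at each level, and a smaller `tau2` yields deeper hierarchies. Having to balance two interacting thresholds is the tuning burden the abstract refers to.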
Similar resources
A Phrase-Based Method for Hierarchical Clustering of Web Snippets
Document clustering has been applied in web information retrieval, which facilitates users’ quick browsing by organizing retrieved results into different groups. Meanwhile, a tree-like hierarchical structure is well-suited for organizing the retrieved results in favor of web users. In this regard, we introduce a new method for hierarchical clustering of web snippets by exploiting a phrase-based ...
Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics
This paper discusses the future of World Wide Web development, called the Semantic Web. Undoubtedly, the Web is one of the most important services on the Internet and has had the greatest impact on the spread of the Internet in human societies. Internet penetration has been an effective factor in the growth of the volume of information on the Web. The massive growth of informat...
Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element to achieve sustainable industrial development. Forecasting future trends of steel consumption based on analysis of nonlinear patterns using artificial intelligence (AI) techniques is the main purpose of this paper. Because there are several features affecting target variable which make the analysis of relations...
Hierarchical Clustering of documents-A brief study and implementation in MATLAB
The paper discusses and implements hierarchical clustering of documents. The objective is to group similar documents together using hierarchical clustering methods. The paper aims at organizing a set of documents into clusters. The paper is focused on Web Content mining by clustering web documents. Clustering is done on document corpus in MATLAB environment. The result is groups or clusters of ...
Using Growing hierarchical self-organizing maps for document classification
The self-organizing map has shown to be a stable neural network model for high-dimensional data analysis. However, its applicability is limited by the fact that some knowledge about the data is required to define the size of the network. In this paper we present the Growing Hierarchical SOM. This dynamically growing architecture evolves into a hierarchical structure of self-organizing maps accor...
Publication date: 2010